Summary. - The Middle East respiratory syndrome coronavirus (MERS-CoV) is a recently emerged betacoronavirus
with high fatality. Recently, dipeptidyl peptidase 4 (DPP4, CD26) was identified as the host cell receptor for
MERS-CoV. Interestingly, despite the common presence of DPP4 receptors, binding to and infection of various
cells show marked variability. In this report, we provide a tool for predicting the host tropism of the
virus based on the host receptor binding interface. We found that the amino acid residues K267, Q286, T288,
R317, R336, Q344, A291, L294, and I295 in lancets 4 and 5 of the DPP4 receptor are involved in the binding of
MERS-CoV to cells. Changes in these residues correspond to a profound decrease in virus binding to
cells. The nine residues at the interface between the virus spike and lancets 4 and 5 of host DPP4 can be
used as a predictive tool for host tropism and virus affinity to host cell receptors.
Introduction

Coronaviruses (CoVs) are enveloped single-stranded
RNA viruses that infect humans and a wide variety of animals,
causing severe respiratory or enteric symptoms (Chang et
al., 2012; Perlman and Netland, 2009). CoVs that infect
humans belong to the alpha and beta CoV genera. Severe
acute respiratory syndrome (SARS) has been associated with
the Betacoronavirus genus.

CoVs are characterized by high recombination frequencies.
The large size of the virus genome, the unique viral replication
mechanism, the low fidelity of coronavirus-encoded polymerases and
the high recombination rate account for unexpected viral evolution,
including infection of new hosts, changes in clinical signs and
resistance to therapy or vaccination. The human CoV
OC43 has evolved from bovine CoV. Furthermore, porcine
respiratory CoV has evolved from a gastrointestinal ancestor
(Chang et al., 2012; Laude et al., 1993).

MERS-CoV was initially identified in the Arabian Peninsula
in 2012 and was assigned to the Betacoronavirus
genus. The CoV genome encodes 4 major structural
proteins: nucleocapsid (N), spike (S), membrane (M)
and small envelope (E) proteins. The S protein is
a glycoprotein essential for viral attachment to cell surface
receptors. It is cleaved in host cells into S1 and S2
subunits. S1 binds the host receptor, while S2
mediates membrane fusion (Wang et al., 2013).

The virus replicates in different hosts using DPP4 as
a functional receptor (Ohnuma et al., 2013). DPP4 has been shown
to be the only essential receptor for binding of the MERS-CoV spike
to host cells. Therefore, DPP4 constitutes a unique binding
site for MERS-CoV, which differs from the receptor
of SARS-CoV (ACE2; Wang et al., 2013). Interestingly, the
presence of the DPP4 receptor in a host cell does not guarantee
binding of and infection with MERS-CoV. For instance,
MERS-CoV can replicate in cells of humans, pigs, rabbits and
non-human primates (Chan et al., 2013). In contrast, despite
the expression of DPP4, replication of MERS-CoV was not
possible in hamsters (de Wit et al., 2013) and ferrets (Raj et
al., 2013). Furthermore, transfection of ferret kidney cells
with functional human DPP4 receptors rendered the cells
susceptible to infection with MERS-CoV (Raj et al., 2013).

In this study, bioinformatics approaches were combined with
the previously determined critical factors for binding and infection
to predict the host tropism
of the newly emerged MERS-CoV. Here, we identify a group
of amino acids at the virus-host receptor interface as a marker for
efficient binding and subsequent infection with MERS-CoV.
The resulting predictive tool can be used to predict the host
tropism of MERS-CoV in the surrounding environment of
an infected region.

Materials and Methods 

The full-length DPP4 sequences were retrieved from the nucleotide
repository at the National Center for Biotechnology
Information (NCBI). The potential protein domain contents of
the retrieved sequences were analyzed with the domain search tools
at NCBI. A BLAST search was used to identify hits with high
homology. The retrieved sequences of various species were exported
and analyzed for differences in amino acid composition. A multiple
sequence alignment was constructed using the Clustal Omega tool
at the European Bioinformatics Institute (EBI). The output file was
retrieved and manually edited in GeneDoc software. The similarity
and homology percentages were calculated with UGENE 1.12.2 for Mac.
The output alignments from Clustal Omega were examined
in Dendroscope software for creation of phylogenetic trees.
The phylogenetic tree was created by the neighbor-joining method in
the output formats of radial phylogram or circular cladogram.
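
As a rough illustration of what an identity percentage measures, the following Python sketch computes percent identity between two rows of an already aligned pair of sequences. The function name, the convention of skipping gapped columns and the toy sequences are assumptions made for illustration only. Alignment tools differ in how they treat gaps, so values from this sketch need not match those reported by BLAST or UGENE.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap.

    Both inputs must be rows taken from the same alignment, so they
    have equal length and '-' marks a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns without a gap in either row
    matches = 0   # compared columns with identical residues
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy rows (hypothetical, not DPP4 sequences): 8 ungapped columns, 7 matches.
toy_identity = percent_identity("ACDEFGHIK", "ACDQFG-IK")
```

In the toy example, one column is skipped because of the gap and one of the remaining eight columns differs, so the result is 87.5.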

Results and Discussion 

Protein bioinformatics is a useful strategy for
determining biomolecular interactions (Kandeel, 2014;
Kandeel and Kitade, 2013a,b), host restrictions of virus infection
(Tonnessen et al., 2013) and viral evolution (Cui et al.,
2013; Liu et al., 2012). In this report, bioinformatic tools
were adopted to predict the MERS-CoV host tropism. The
prediction was based on two criteria: first, the alignment of
protein sequences of lancets 4 and 5 of DPP4 and their
phylogenetic relations with the human DPP4 (Fig. 1a and b);
second, the alignment of amino acid residues at the
interface of interaction (Fig. 1c) between lancets 4 and 5
of DPP4 and the spike protein of MERS-CoV.

The structure of DPP4 shows an N-terminal β-propeller
domain composed of 8 lancets (blades) and a C-terminal hydrolase domain.
Lancets 4 and 5 were found to be the site for binding of the
S protein of MERS-CoV. Replacement of lancets 4 and
5 or mutational changes within them had a drastic effect on binding
and infection with MERS-CoV.

Phylogenetic analysis of lancets 4 and 5 of DPP4 showed
that non-human primates (at least 98% homology,
Table 1) and rabbits are most closely related to the human DPP4,
followed by bats and rodents such as guinea pig, hamster and
rat. The most divergent DPP4 was that of birds and alligator
(homology less than 60%, Table 1).
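
The grouping described above can be sketched in Python as a simple banding of identity values. The identity percentages below are copied from a subset of Table 1. The function name and the use of 98% and 60% as cut-offs are illustrative assumptions based on the figures quoted in the text, not thresholds defined by the method.

```python
# Identity (%) to human DPP4, copied from a subset of Table 1.
IDENTITY = {
    "Chimpanzee": 99,
    "Rhesus monkey": 98,
    "Rabbit": 94,
    "Brown bat": 78,
    "Hamster": 73,
    "Ferret": 63,
    "Fowl": 56,
    "Pigeon": 55,
    "Alligator": 52,
}


def split_by_identity(identity, high=98, low=60):
    """Split species into three bands by identity to the human sequence.

    `high` and `low` are assumed cut-offs (in percent) chosen for
    illustration; species are returned in alphabetical order.
    """
    near = sorted(name for name, pct in identity.items() if pct >= high)
    divergent = sorted(name for name, pct in identity.items() if pct < low)
    intermediate = sorted(
        name for name, pct in identity.items() if low <= pct < high
    )
    return near, intermediate, divergent


near, intermediate, divergent = split_by_identity(IDENTITY)
```

With these inputs, the two primates fall in the first band and the two birds and the alligator fall in the last one. Overall identity is only the first criterion; it does not by itself indicate whether the interface residues are conserved.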

The binding interface between the S protein and DPP4 is
composed of polar contacts from the hydrophilic residues
K267, Q286, T288, R317, R336 and Q344 surrounding
a hydrophobic center formed by A291, L294 and I295
(Fig. 1c). Disruption of these residue interactions
resulted in a profound decrease of virus entry (Wang et al.,
2013). In this report, we assume that MERS-CoV binding
and replication in a specific host depend on the status of
these nine residues. The high replication of
MERS-CoV in non-human primates coincides with conservation
of all nine residues. Rabbits and,
to a lesser extent, pigs showed a high residue conservation
pattern. Therefore, infection with MERS-CoV was possible
in cells of these animals. Camelids showed a high conservation
profile, indicating that camels are a potential
host for the virus. In this context, neutralizing antibodies
against MERS-CoV were detected in camels from the Middle
East (Perera et al., 2013; Reusken et al., 2013). A similarly high
conservation pattern was evident in the sequences from farm
animals such as sheep, goats and cattle. Although bat DPP4 was highly
divergent from human DPP4, it showed few changes in
the nine marker residues. This supports a possible role of
bats in the transmission of MERS-CoV. However, the estimated
MERS-CoV infection rate in bats was at least 3-fold lower than
that of SARS-CoV (Memish et al., 2013). Cats and rodents
showed amino acid replacements in at least half of the nine
marker residues. This may explain the low viral load
in their cell cultures compared with primates (Chan et al.,
2013). Compared to the human DPP4, birds showed the
highest divergence (Fig. 1b). Furthermore, birds showed
the greatest changes in the marker residues
(Fig. 1a). In agreement with our assumption, chicken-derived
cell culture did not support the replication of MERS-CoV
(Chan et al., 2013).
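
The comparison of marker residues described above can be sketched in Python. The sketch assumes a pairwise alignment row for human DPP4 and for the query species, with '-' marking gaps, and maps human residue numbers to alignment columns before comparing. The nine marker residues come from the text. The function names, the `ref_start` parameter and the toy sequences are hypothetical. No cut-off for the number of changed residues is defined here, so the sketch only reports which markers are conserved and which are replaced.

```python
# The nine interface residues of human DPP4 named in the text.
MARKER_RESIDUES = {
    267: "K", 286: "Q", 288: "T", 291: "A", 294: "L",
    295: "I", 317: "R", 336: "R", 344: "Q",
}


def column_of_position(aligned_ref: str, position: int, ref_start: int = 1) -> int:
    """Return the 0-based alignment column of reference residue `position`.

    `ref_start` is the residue number of the first non-gap character in
    `aligned_ref` (an assumption the caller must supply for fragments).
    """
    current = ref_start - 1
    for col, residue in enumerate(aligned_ref):
        if residue != "-":
            current += 1
            if current == position:
                return col
    raise ValueError(f"position {position} is outside the aligned reference")


def marker_conservation(aligned_ref, aligned_query, markers, ref_start=1):
    """Compare marker residues between reference and query rows.

    Returns (conserved_positions, changed), where `changed` maps each
    altered position to the query residue (or '-' for a gap).
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    conserved, changed = [], {}
    for position, expected in markers.items():
        col = column_of_position(aligned_ref, position, ref_start)
        if aligned_ref[col] != expected:
            raise ValueError(f"reference has {aligned_ref[col]} at {position}")
        if aligned_query[col] == expected:
            conserved.append(position)
        else:
            changed[position] = aligned_query[col]
    return conserved, changed


# Toy rows (hypothetical, not DPP4): reference residues are numbered 10-14.
toy_conserved, toy_changed = marker_conservation(
    "MK-QTA", "MKRQSA", {11: "K", 13: "T", 14: "A"}, ref_start=10
)
```

In the toy example, positions 11 and 14 are conserved and position 13 is replaced by S. For real use, `MARKER_RESIDUES` would be passed together with the aligned human and query DPP4 rows and the residue number at which the aligned human fragment starts.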

In brief, resistance to MERS-CoV replication was associated
with significant changes in the amino acid residues
at the interface of interaction between the S protein and lancets
4 and 5 of DPP4. Experimental studies are needed to
confirm our predictive model. The degree of conservation
of these residues can be used to
predict the host tropism of MERS-CoV. These predictions
might be of value in prevention and control programs, in
which the sensitivity and resistance to MERS-CoV infection
in the surrounding environment can be anticipated.
Fig. 1

(a) Amino acid sequence alignment of lancets 4 and 5 of the β-propeller domain of DPP4 in different species, (b) phylogenetic tree of various
animal species based on the amino acid sequence of DPP4, (c) structure of MERS-CoV complexed with human DPP4

The receptor binding domain of MERS-CoV spike (green), extracellular part of DPP4 (turquoise) and interaction interface (white) are shown.
Table 1. Characteristics of amino acid sequences of DPP4 in different species

Scientific name            Common name             Total score  Query cover  E value   Identity %  Acc. No.
Homo sapiens               Humans                  197          100%         4.00E-57  100%        CAA43118.1
Nomascus leucogenys        White-cheeked gibbon    195          100%         2.00E-56  99%         XP_003266219.1
Gorilla gorilla            Gorilla                 193          100%         8.00E-56  98%         XP_004032754.1
Macaca mulatta             Rhesus monkey           193          100%         9.00E-56  98%         NP_001034279.1
Macaca fascicularis        Long-tailed macaque     193          100%         9.00E-56  98%         XP_005573375.1
Papio anubis               Baboon                  193          100%         1.00E-55  98%         XP_003907588.1
Pan troglodytes            Chimpanzee              193          100%         1.00E-55  99%         XP_515858.2
Oryctolagus cuniculus      Rabbit                  188          100%         5.00E-54  94%         XP_002712206.1
Equus caballus             Horse                   179          100%         7.00E-51  88%         XP_005601601.1
Ceratotherium simum        Rhinoceros              175          100%         2.00E-49  87%         XP_004428321.1
Loxodonta africana         Elephant                174          100%         2.00E-48  85%         XP_003406047.1
Trichechus manatus         Sea cow                 164          100%         2.00E-45  81%         XP_004375482.1
Bos taurus                 Cow                     155          100%         6.00E-44  77%         DAA32742.1
Cavia porcellus            Guinea pig              160          100%         1.00E-43  80%         XP_003478612.2
Capra hircus               Goat                    155          100%         1.00E-42  77%         XP_005676104.1
Cricetulus griseus         Hamster                 154          100%         1.00E-42  73%         EGW01899.1
Myotis lucifugus           Brown bat               156          98%          2.00E-42  78%         XP_006083275.1
Ovis aries                 Sheep                   156          100%         2.00E-42  77%         XP_004004709.1
Dasypus novemcinctus       Armadillo               154          100%         6.00E-42  76%         XP_004464464.1
Camelus ferus              Camel                   153          100%         2.00E-41  75%         XP_006176870.1
Myotis brandtii            Vesper bat              153          98%          3.00E-41  77%         EPQ03437.1
Pipistrellus pipistrellus  Common pipistrelle bat  151          98%          2.00E-40  75%         AGF80256.1
Sus scrofa                 Pig                     148          98%          2.00E-39  74%         NP_999422.1
Orcinus orca               Killer whale            147          100%         3.00E-39  73%         XP_004283669.1
Felis catus                Cat                     146          100%         8.00E-39  71%         NP_001009838.1
Ailuropoda melanoleuca     Panda                   144          100%         9.00E-38  70%         XP_002924912.1
Mustela putorius furo      Ferret                  130          100%         5.00E-33  63%         ABC72084.1
Columba livia              Pigeon                  107          98%          1.00E-24  55%         XP_005498754.1
Falco cherrug              Falcon                  105          98%          4.00E-24  51%         XP_005443040.1
Ovophis okinavensis        Pit viper               104          100%         6.00E-24  54%         BAN82157.1
Alligator sinensis         Alligator               100          98%          1.00E-22  52%         XP_006037514.1
Pseudopodoces humilis      Ground tit              100          98%          1.00E-22  51%         XP_005520053.1
Taeniopygia guttata        Zebra finch             99           98%          5.00E-22  50%         XP_004176799.1
Gallus gallus              Fowl                    94           82%          4.00E-20  56%         NP_001026426.1